Bozhday Aleksandr Sergeevich, Doctor of engineering sciences, professor, sub-department of CAD systems, Penza State University (40 Krasnaya street, Penza, Russia), firstname.lastname@example.org
Timonin Aleksey Yur'evich, Postgraduate student, Penza State University (40 Krasnaya street, Penza, Russia), email@example.com
Background. Openly accessible online social data are of the greatest scientific and analytical interest, as they are directly linked to all kinds of human activity. However, in their initial form such data are poorly suited to automated applied processing and should be presented in a structured, convenient, human-readable form: a social profile. A social profile is built by analyzing filtered initial online data from open sources. Dynamic, unstructured data, including textual and multimedia information, cannot be handled by classical analytic means. New analytical methods and approaches must therefore be defined for each type of information in order to use the initial data as effectively and fully as possible.
Materials and methods. The task of analyzing personal social profile data is addressed using the mathematical tools of set theory, Big Data software and NoSQL data stores, analytic tools for social media, and modern methods for analyzing multimedia data.
Results. It is suggested that the initial social profile data be divided into static and dynamic parts. The article considers methods for analyzing unstructured textual social profile data, describes a technology for finding implicit dependencies in texts by means of visual analysis and natural language processing, and reviews techniques for analyzing multimedia content (graphics, sound).
Conclusions. The stage of textual and multimedia social profile data analysis is the most important in terms of results and quite complicated to implement. The analysis can be partially automated through the use of visual analysis, natural language processing (NLP), neural networks and specialized algorithms. The data obtained provide a detailed, in-depth view of social profile entities and their relations and can be used for further, deeper social research.
Big Data, Data Mining, data analysis, multimedia, personal social profile, public data sources, unstructured data